Rank | Count | Beginning |
---|---|---|
5143 | 2774 | La |
3021 | 753 | Esas |
4125 | 747 | Homuli |
1850 | 698 | Demografio |
8653 | 401 | Po |
1540 | 302 | De |
2655 | 268 | Ek |
9295 | 237 | Segun |
9898 | 92 | Ye |
8423 | 88 | Ol |
2947 | 65 | En |
8046 | 64 | Lua |
4899 | 42 | Il |
1313 | 35 | Bazala |
2603 | 35 | Dum |
4084 | 28 | Historio |
8530 | 24 | On |
8928 | 22 | Pos |
8303 | 19 | Naski |
9732 | 18 | To |
3801 | 16 | Eventi |
5031 | 15 | Kande |
9653 | 15 | Ta |
8137 | 14 | Ma |
1171 | 13 | Altra |
8605 | 12 | Per |
4885 | 11 | Ica |
7983 | 11 | Lia |
7987 | 11 | Li |
8068 | 11 | Lu |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
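The same count can be sketched outside the database. The following is a minimal Python sketch, not the method used to produce the table: it assumes the sentences are available as plain strings, and the function name `sentence_beginnings` and the sample sentences are hypothetical. Like `substring_index(sentence, ' ', N)`, it keeps everything before the N-th space and then counts identical beginnings.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent n-word sentence beginnings.

    Hypothetical helper: mirrors substring_index(sentence, ' ', n)
    followed by GROUP BY / ORDER BY cnt DESC / LIMIT.
    """
    counts = Counter()
    for sentence in sentences:
        # Keep everything before the n-th space; a sentence with fewer
        # than n spaces is kept whole, as substring_index does.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # most_common sorts by descending count, like ORDER BY cnt DESC.
    return counts.most_common(limit)


# Made-up sample sentences for illustration only, not corpus data.
sample = [
    "La urbo esas granda.",
    "La lando esas mikra.",
    "Esas multa homi.",
]

for beginning, count in sentence_beginnings(sample, n=1):
    print(count, beginning)
```

Setting `n` to 2, 3, or 4 gives the longer beginnings covered in the following subsections. Splitting on a single space is a deliberate choice to match the SQL; a real tokenizer might treat punctuation and repeated whitespace differently.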